Incidence, prevalence, and national burden of interstitial lung diseases in India: Estimates from two studies of 3089 subjects

Background and objective: The epidemiology of interstitial lung diseases (ILDs) in developing countries remains unknown. The objective of this study was to estimate the incidence, prevalence, and national burden of ILDs in India.

Methods: Data of consecutive subjects (aged >12 years) with ILDs included in a registry between March 2015 and February 2020 were analyzed retrospectively. The proportion of each ILD subtype was determined. The crude annual incidence and prevalence of ILDs for our region were estimated. Subsequently, the primary estimates of the national annual incident and prevalent burden of ILD and its subtypes were calculated. Alternative estimates for each ILD subtype were calculated using the current and a large, previous Indian study (n = 1,084). Data were analyzed using SPSS version 22 and are presented descriptively.

Results: A total of 2,005 subjects (mean age, 50.7 years; 47% men) were enrolled. Sarcoidosis (37.3%) was the most common ILD subtype followed by connective tissue disease (CTD)-related ILDs (19.3%), idiopathic pulmonary fibrosis (IPF, 17.0%), and hypersensitivity pneumonitis (HP, 14.4%). The crude annual incidence and prevalence of ILDs were 10.1–20.2 and 49.0–98.1, respectively per 100,000 population. The best primary estimates for the crude national burden of all ILDs, sarcoidosis, CTD-ILD, IPF, HP, and other ILDs (in thousands) were 433–867, 213–427, 75–150, 51–102, 54–109, and 39–78. The respective alternative estimates (in thousands) were sarcoidosis, 127–254; CTD-ILD, 81–162; IPF, 46–91; HP, 130–261; other ILDs, 49–98.

Conclusion: In contrast to developed countries, sarcoidosis and HP are the ILDs with the highest burden in India.

The crude annual incidence of ILDs ranges from 1 to 70.1 per 100,000 population in different studies worldwide [4-16] while the prevalence lies between 6.27 and 97.9 per 100,000 population [4,5,10,13]. Most previous studies have not used the contemporary classification proposed by the latest American Thoracic Society/European Respiratory Society consensus statements [1,2]. Also, no study has reported the incidence and prevalence of ILDs from developing countries. In the developing world, non-communicable respiratory diseases remain underrecognized due to the high burden of infectious diseases such as tuberculosis [17]. There is an unmet need for epidemiologic data on ILDs from India, the world's second most populous country. Such knowledge can better inform national and international efforts for patient care and research in ILDs.
The spectrum of ILD subtypes at our center has been previously described [3]. Herein, we describe the incidence and prevalence of ILDs in our region located in northern India. The national incident and prevalent burdens of ILD and its subtypes have also been estimated using the current study and a previous large multicenter study from India [18,19].

Methods
In this study, data of subjects enrolled into an ILD registry at our Chest Clinic between March 2015 and February 2020 were analyzed retrospectively. The study protocol (Pulm653) was approved by the Institutional Ethics Committee, Postgraduate Institute of Medical Education and Research, Chandigarh, India. Written informed consent was obtained from all the subjects for participation in the registry. Consent was obtained from parents or guardians for the minors included in the study. We have previously published the data of a part of the study population included in the current study [3].

Subjects and study procedures
Subjects were enrolled into our ILD registry if they met all the following criteria: (i) age >12 years (adolescents and adults); (ii) diagnosis of ILD; and, (iii) willingness to provide informed consent. Subjects with any of the following were excluded: (i) final diagnosis of a disease other than an ILD; and (ii) lack of informed consent. The demographic details, spirometric measurements, the final diagnosis, and the dates of diagnosis and death were extracted from the registry data. The proportion of each ILD subtype was calculated.

Diagnosis of ILD and its subtypes
In our Chest Clinic, all subjects with a suspected ILD were referred to one author (SD) for inclusion into the ILD registry. A detailed history was obtained, including the symptoms, the risk factors for various ILDs, family history, history of exposures to cigarette smoke, drugs, other environmental dusts, and the presence of any connective tissue disease (CTD). A thin-section (0.5-1.5 mm) computed tomography (CT) of the chest, spirometry, and serology for autoimmune diseases were obtained; further tests were guided by the suspected diagnosis. Lung biopsy or other invasive procedures were performed for obtaining tissue samples if indicated [20,21]. The diagnosis of the ILD subtype was made as described previously [3] using contemporary guidelines, statements, or expert opinions [1,2,22-27]. In general, subjects with suspected sarcoidosis underwent transbronchial needle aspiration, endobronchial biopsy, and/or transbronchial lung biopsy. CTD-ILDs were diagnosed on clinical features, the detection of serum autoantibodies, and the presence of ILD on the chest CT. Idiopathic pulmonary fibrosis was mostly diagnosed on the presence of usual interstitial pneumonia pattern (definite or probable) on the chest CT. Hypersensitivity pneumonitis was diagnosed on a characteristic appearance on the chest CT and a definite history of exposure to offending antigens. In those with suspected IPF or HP, lung biopsy (mostly transbronchial lung cryobiopsy or surgical lung biopsy) was performed when the clinical or imaging findings were inconsistent. Wherever needed, the clinical, radiologic, and histopathologic data were reviewed by a multidisciplinary team comprising two or more pulmonologists, a radiologist, and a pathologist to assign a diagnosis. In general, patients were followed every 3-6 months. Information received on the death of any included patient was recorded.

Incidence and prevalence of ILDs in our region
The crude annual incidence and prevalence of ILDs were calculated for the Tricity region. Our hospital is located in this region that comprises the three districts of Chandigarh (a Union Territory), Panchkula (in the state of Haryana), and Sahibzada Ajit Singh Nagar (in the state of Punjab). The estimated population of persons above the age of 12 years (henceforth, referred to as the 'population') of this region was obtained from the 2011 national census data [28]. Study participants residing in the Tricity and diagnosed during the study period were designated as 'incident cases'. The crude annual incidence of ILDs per 100,000 population was calculated for each year (years 1-5) and the entire study duration (average annual incidence).
Next, the records of our clinic were searched for the reported deaths amongst Tricity residents. The study subjects or their next of kin were also contacted telephonically between March and April 2020 to obtain information on death or migration. Where the vital status of the subjects was unconfirmed, clinic records were searched for data on the radiologic features, lung function trends, the clinical condition at the last follow-up, and the visit pattern. Using this information, two authors (SD, RA) made informed assumptions on the vital status (alive or dead) of the subjects as of March 1, 2020. The point prevalence was then estimated on three different assumptions for defining the 'prevalent cases': (i) all the subjects with unavailable vital status were assumed dead; (ii) all of them were assumed alive; or (iii) the vital status was assigned using informed assumption. The proportion of each incident and prevalent ILD subtype was compared with another recent large (n = 1,084) study of ILDs in India (the ILD India registry) [18,19].
Subsequently, all incident (and prevalent) cases were divided into eight age-and-gender groups using four age intervals (13-39 years, 40-59 years, 60-79 years, and ≥80 years). Direct standardization was performed against the 2011 national population [28]. The crude incidence and prevalence of the major ILD subtypes (sarcoidosis, IPF, CTD-ILD, HP, and others) were also calculated; standardization was avoided due to small samples.
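Direct standardization can be illustrated with a minimal Python sketch. The function name `direct_standardized_rate` and every count below are hypothetical placeholders, not registry data; the sketch only shows the general technique of weighting stratum-specific rates by the size of each stratum in a standard population.

```python
def direct_standardized_rate(cases, local_pop, standard_pop, per=100_000):
    """Weight each stratum's local rate by that stratum's standard-population size."""
    if not (len(cases) == len(local_pop) == len(standard_pop)):
        raise ValueError("all inputs must have one entry per stratum")
    # Expected number of cases if the standard population had the local rates.
    expected = sum(c / n * s for c, n, s in zip(cases, local_pop, standard_pop))
    return expected / sum(standard_pop) * per


# Hypothetical two-stratum example (the study itself used eight
# age-and-gender strata; these numbers are invented for illustration).
toy_cases = [2, 10]
toy_local_pop = [100_000, 100_000]
toy_standard_pop = [750_000, 250_000]

toy_rate = direct_standardized_rate(toy_cases, toy_local_pop, toy_standard_pop)
print(f"standardized rate: {toy_rate:.2f} per 100,000")
```

In this toy case the crude local rate would be 6 per 100,000, whereas the standardized rate is lower because the standard population is weighted toward the low-incidence stratum.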

Calculation of burden of ILDs in India
Assuming the incidence and prevalence estimates for the Tricity to represent the entire country, the national incident and prevalent burden of ILD and its subtypes were calculated, based on the 2011 national population (primary estimates). To calculate the alternative estimates for the ILD subtypes, the average proportion of each ILD subtype from the current study and the ILD India Registry was multiplied by the overall national annual incident and prevalent burden of ILDs [18,19]. Finally, estimates of all epidemiologic indices were calculated assuming different referral rates of ILD patients to our clinic (ranging between 10% and 90%, at intervals of 10%).
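The burden calculation described above can be sketched in Python as follows. This is an illustrative outline under stated assumptions, not the authors' code: the function names, the toy population, and the toy subtype proportions are all hypothetical. Dividing the observed rate by the referral rate is an assumption that is consistent with the reported figures (for example, 4.03 per 100,000 at a 40% referral rate gives about 10.1).

```python
def adjust_for_referral(observed_rate, referral_rate):
    """Scale a clinic-based rate up to the region, assuming only a fraction
    `referral_rate` (between 0 and 1) of all cases reached the clinic."""
    if not 0 < referral_rate <= 1:
        raise ValueError("referral_rate must be in (0, 1]")
    return observed_rate / referral_rate


def national_burden(rate_per_100k, national_population):
    """Number of cases expected if the regional rate applied nationwide."""
    return rate_per_100k * national_population / 100_000


def alternative_subtype_burden(total_burden, props_a, props_b):
    """Split the total burden using the mean subtype proportion of two studies."""
    return {s: total_burden * (props_a[s] + props_b[s]) / 2 for s in props_a}


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
toy_rate = adjust_for_referral(4.0, 0.40)          # 10 per 100,000
toy_total = national_burden(toy_rate, 1_000_000)   # 100 cases
toy_split = alternative_subtype_burden(
    toy_total,
    {"subtype A": 0.4, "subtype B": 0.6},  # proportions in study 1 (invented)
    {"subtype A": 0.2, "subtype B": 0.8},  # proportions in study 2 (invented)
)

# Sensitivity analysis over referral rates of 10% to 90% in steps of 10%.
sensitivity = {pct: adjust_for_referral(4.0, pct / 100) for pct in range(10, 100, 10)}
for pct, rate in sensitivity.items():
    print(f"referral {pct}%: {rate:.1f} per 100,000")
```

Keeping the referral adjustment separate from the burden calculation makes it straightforward to rerun the whole chain for each assumed referral rate.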

Statistical analysis
Data were entered into worksheets using the computer program Microsoft Excel and analyzed using the statistical package SPSS version 22. Data are expressed as mean ± standard deviation (SD) or as number (percentage). Proportions were compared using the chi-squared test. A p value of less than 0.05 was considered to reflect statistical significance.
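For readers who wish to reproduce a comparison of proportions without SPSS, the following is a minimal Python sketch of a Pearson chi-squared test on a 2x2 table. It assumes no continuity correction, which may differ from the SPSS output the authors used; the function name and the counts are hypothetical. For one degree of freedom, the p value equals erfc(sqrt(chi2 / 2)).

```python
import math


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared test (no continuity correction) for the table
    [[a, b], [c, d]]; returns (statistic, two-sided p value) for 1 df."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("every row and column must contain at least one count")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the chi-squared distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: subtype present / absent in two studies of 100 each.
toy_chi2, toy_p = chi_squared_2x2(20, 80, 40, 60)
print(f"chi-squared = {toy_chi2:.2f}, p = {toy_p:.4f}")
print("significant at 0.05" if toy_p < 0.05 else "not significant at 0.05")
```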
Results
Of the 517 subjects residing in the Tricity region (Table 4), 409 were incident cases. Amongst incident cases, the proportions of all ILD subtypes except CTD-ILD differed between the current study and the ILD India registry. Among prevalent cases, the proportions of sarcoidosis and HP differed between the two studies (Table 4). The Tricity region's population of individuals >12 years of age was 2,028,557. Accordingly, the crude annual incidence of ILDs per 100,000 population for the five successive years of our study period was 4.29, 3.94, 3.89, 4.63, and 3.40, yielding an average of 4.03 (Table 5). For the age groups 13-39, 40-59, 60-79, and ≥80 years, the estimates for annual incidence (per 100,000) were 1.39, 5.11, 13.13, and 7.94, respectively, for men and 1.58, 9.06, 14.13, and 1.67, respectively, for women.
A total of 380 Tricity subjects were alive, 100 had died, five had migrated, while the vital status remained unknown for 32, as of March 1, 2020. The total number of prevalent cases of ILDs in the region was 412, 380, and 398 based on whether the 32 subjects with unknown vital status were assumed to be alive, dead, or assigned a status using the best assumptions, respectively. The crude prevalence of ILDs in the region according to the 'best assumptions on vital status' method was 19.62 cases per 100,000 population. Assuming 20-40% referral rates to our center, the estimated crude annual incidence and prevalence were 10.1-20.2 and 49.0-98.1, respectively, per 100,000 population (Table 5). Accordingly, the estimated standardized national annual incident cases of ILDs ranged from 92,646 to 185,293, while the national (prevalent) burden was estimated at 448,060 to 896,120.
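The regional figures above can be re-derived from the counts given in the text. The short Python sketch below uses only numbers reported in this section (409 incident cases over five years, 398 prevalent cases, and a population of 2,028,557); the helper name `per_100k` is introduced here for illustration, and dividing by the referral rate is an assumption that matches the reported ranges.

```python
POPULATION = 2_028_557      # Tricity residents aged >12 years
INCIDENT_CASES = 409        # diagnosed during the study period
STUDY_YEARS = 5
PREVALENT_CASES = 398       # 'best assumptions on vital status' method


def per_100k(cases, population):
    """Express a case count as a rate per 100,000 population."""
    return cases / population * 100_000


crude_incidence = per_100k(INCIDENT_CASES / STUDY_YEARS, POPULATION)
crude_prevalence = per_100k(PREVALENT_CASES, POPULATION)

# Assumed referral rates of 40% and 20% give the lower and upper bounds.
referral_rates = (0.40, 0.20)
incidence_range = tuple(round(crude_incidence / r, 1) for r in referral_rates)
prevalence_range = tuple(round(crude_prevalence / r, 1) for r in referral_rates)

print(f"crude annual incidence: {crude_incidence:.2f} per 100,000")
print(f"crude prevalence: {crude_prevalence:.2f} per 100,000")
print(f"referral-adjusted incidence: {incidence_range}")
print(f"referral-adjusted prevalence: {prevalence_range}")
```

The outputs reproduce the crude incidence of 4.03 and prevalence of 19.62 per 100,000, and the referral-adjusted ranges of 10.1-20.2 and 49.0-98.1.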

Table 4 notes: ANCA, antineutrophil cytoplasmic antibody; CTD, connective tissue disease; CVID, common variable immunodeficiency; Ig, immunoglobulin; ILD, interstitial lung disease; IPF, idiopathic pulmonary fibrosis; LIP, lymphocytic interstitial pneumonia; NOS, not otherwise specified. The p values are derived by applying the chi-squared test for the difference in proportions for each ILD subtype between the current study and the ILD India registry study for incidence and prevalence (by the best assumptions method). https://doi.org/10.1371/journal.pone.0271665.t004

Table notes (incidence and prevalence estimates): Alt, alternative estimates; CTD, connective tissue disease; HP, hypersensitivity pneumonitis; ILD, interstitial lung disease; IPF, idiopathic pulmonary fibrosis. All values for prevalence are per 100,000 and those for incidence are per 100,000 population per year. The prevalence was calculated based on best assumptions on the vital status for subjects with unknown status on March 1, 2020. The alternative estimates were prepared by averaging the proportion of each ILD subtype from the current study and the ILD India Registry study. The values in bold font provide the range based on our best assumptions of the referral rates.

Discussion
The estimated crude annual ILD incidence and prevalence in our region (per 100,000 population) were 10.1-20.2 and 49.0-98.1, respectively, while the standardized national prevalent burden was 0.45-0.89 million. To our knowledge, this is the first study on the incidence, prevalence, and burden of ILDs from a developing country. It is also the largest single-center experience of the spectrum of ILDs diagnosed using contemporary guidelines. Our primary estimates were derived from prospectively collected data in a hospital-based registry. Our hospital is the largest referral center in the region north of the national capital offering specialized care for sarcoidosis and other ILDs. Yet, it is expected that not all patients in this region would have registered with us. In a survey, it was found that about 80% of the primary physicians in our region referred suspected patients with IPF to higher centers [29]. This region has two other major public sector hospitals, five large private hospitals, and several independent private clinics providing care to ILD patients. Other potential factors hampering enrolment into our registry are misdiagnosis at the primary level (such as sarcoidosis and HP wrongly diagnosed as tuberculosis, and IPF as chronic obstructive pulmonary disease), patient hesitancy to seek tertiary care, and patients with sarcoidosis and CTD-ILD being treated by rheumatologists and internists. Therefore, we estimated tentatively that about 20-40% of the ILD patients from the region were registered at our clinic. The alternative estimates for the ILD subtypes derive from a larger dataset including the current study and a large multicenter study of 1,084 subjects from different regions of the country, and thus may be more representative [18].
Our best estimated crude annual ILD incidence (10.1-20.2/100,000) lies within the overall range (1-70.1 per 100,000 population) reported in other studies (Table 7). It is close to that reported in one of the most well-performed studies of ILD epidemiology in recent times from Greater Paris, France (19.4/100,000) [13]. To our knowledge, ILD prevalence has been reported by only four previous studies and ranges from 6.3 to 97.9 per 100,000 population [4,5,10,13]; our estimates (49.0-98.1/100,000) fall on the higher side of this range (Table 7).
The standardized annual ILD incidence in the current study (10.5-21.0/100,000) is about 10-20 times lower than that for tuberculosis in India (199/100,000) [30]. Moreover, the national burden of ILDs (0.45-0.89 million) is about 90 times lower than that of chronic obstructive pulmonary disease (55.3 million) and about 60 times lower than that of asthma (37.9 million) [31]. Even for allergic bronchopulmonary aspergillosis, a less common respiratory disorder, the best estimated total national burden is 0.86-1.52 million, about twice that of ILDs [32]. With a population prevalence of less than 10/10,000, the ILDs even as a single group remain rare disorders [33]. However, the total of 0.45-0.89 million cases represents a significant disease burden at the national level. The alternative estimates suggest that sarcoidosis (127.2-254.5 thousand cases) and HP (130.5-260.9 thousand cases) have a particularly significant presence in the country. The remarkable burden of ILDs estimated in this study might sensitize government and non-government healthcare agencies towards greater resource allocation for these diseases.
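The fold differences quoted above follow from simple ratios. The Python sketch below recomputes them from the figures stated in this paragraph; the helper `fold_range` is a hypothetical name introduced for illustration. Because the ILD estimates are ranges, each comparison also yields a range, and the approximate figures in the text fall within these ranges.

```python
def fold_range(comparator, ild_low, ild_high):
    """Return (smallest, largest) ratio of a comparator value to the ILD range."""
    return comparator / ild_high, comparator / ild_low


# Standardized annual incidence per 100,000 population.
tb_fold = fold_range(199.0, 10.5, 21.0)

# National prevalent burden in millions of cases.
copd_fold = fold_range(55.3, 0.45, 0.89)
asthma_fold = fold_range(37.9, 0.45, 0.89)

for label, (low, high) in [
    ("tuberculosis incidence", tb_fold),
    ("COPD burden", copd_fold),
    ("asthma burden", asthma_fold),
]:
    print(f"{label}: {low:.1f} to {high:.1f} times that of ILDs")
```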
The ILD spectrum in India remains contentious. In our previous study (n = 803), sarcoidosis (42.2%) was the commonest ILD, followed by IPF (21.2%), CTD-ILD (12.7%), and HP (10.7%). The present analysis, which includes the patient population of our previous study, reveals a slightly different spectrum. Though sarcoidosis (37.3%) remains the commonest, the second most common ILD is CTD-ILD (19.3%) instead of IPF, which is placed third now (17.0%). The proportion of HP is slightly higher at 14.4%. These differences might result from changes in referral practices, better awareness, and improved use of various diagnostic techniques. The current spectrum still differs from the ILD India registry, where HP was the most common ILD subtype (Fig 1) [18].
This study has a few limitations. The estimates draw on several assumptions including the vital status of subjects with missing follow-up data, referral rates, and uniform ILD incidence across the country. We used the 2011 national census data as the most recent available resource for the population estimates. This might be inaccurate for our study period owing to population growth. Therefore, we have provided broad estimate ranges considering different referral rates and presented alternative estimates to account for the different ILD spectrum in the ILD India registry. Our estimates are thus crude and tentative approximations like the 'Fermi estimates' [32,39]. Such estimates provide rough assessments that can vary by one log (one order of magnitude). Even rough estimates are potentially valuable as they may guide future investigations, especially community-based studies. Our study's strength is that ILD diagnosis was made at a referral center by an experienced team following the latest diagnostic standards.
In conclusion, the overall incidence and prevalence of ILDs in India are similar to those found in the developed world. However, sarcoidosis and HP have the highest prevalent burden according to the alternative estimates, contrary to the findings from developed countries. Despite being rare, the ILDs represent a significant disease burden. Population-based, multicenter studies from different geographic regions are required to better define the epidemiology of ILDs in India.

Acknowledgments
SD: conceived the article, data collection and analysis, drafted and revised the manuscript, is the guarantor of the content of the manuscript, including the data and analysis
ISS: data collection and analysis, drafted and revised the manuscript
RA: data collection and analysis, drafted and revised the manuscript
VM: data collection and analysis, drafted and revised the manuscript
KTP: data collection and analysis, drafted and revised the manuscript
SK: data collection and analysis, drafted and revised the manuscript
MG: data collection and analysis, drafted and revised the manuscript
AB: data collection and analysis, drafted and revised the manuscript
ANA: data collection and analysis, drafted and revised the manuscript
DB: data collection and analysis, drafted and revised the manuscript
Methodology: Sahajal Dhooria, Ritesh Agarwal, Valliappan Muthu, Kuruswamy Thurai Prasad, Mandeep Garg.